Abstract 

Modelling excesses over a high threshold using the Pareto or generalized Pareto 
distribution (PD/GPD) is the most popular approach in extreme value statistics. 
This method typically requires high thresholds in order for the (G)PD to fit well 
and in such a case applies only to a small upper fraction of the data. The extension 
of the (G)PD proposed in this paper is able to describe the excess distribution 
for lower thresholds in case of heavy tailed distributions. This yields a statistical 
model that can be fitted to a larger portion of the data. Moreover, estimates of 
tail parameters display stability for a larger range of thresholds. Our findings are 
supported by asymptotic results, simulations and a case study. 

1 Introduction 



It is well known that a distribution is in the max-domain of attraction of an 
extreme value distribution if and only if the distribution of excesses over high 
thresholds is asymptotically generalized Pareto (GP) (Balkema and de Haan, 
1974; Pickands, 1975). This result gave rise to the peaks-over-threshold 
methodology introduced in Davison and Smith (1990); see also Coles (2001). The 
method consists of two components: modelling of clusters of high-threshold 
exceedances with a Poisson process and modelling of excesses associated to 
the cluster peaks with a GPD. In practice, a way to verify the validity of the 
model is to check whether the estimates of the GP shape parameter are stable 
when the model is fitted to excesses over a range of thresholds. The question 
then arises how to proceed if this threshold stability is not visible for a given 
data set. From a theoretical point of view, absence of the stability property 
can be explained by a slow rate of convergence in the Pickands-Balkema-de 
Haan theorem. In case of heavy-tailed distributions, the same issue arises when 
fitting a Pareto distribution (PD) to the relative excesses over high, positive 
thresholds. 



A possible solution is to build a more flexible model capable of capturing the 
deviation between the true excess distribution and the asymptotic model. For 
heavy-tailed distributions, this deviation can be parametrized using a power 
series expansion of the tail function (Hall, 1982), or more generally via 
second-order regular variation (Geluk and de Haan, 1987; Bingham et al., 1987). 



The aim of this paper is to propose such an extension, called the extended 
Pareto or extended generalized Pareto distribution (EPD/EGPD). A key dis- 
tinction with other approaches is that although in previous papers the second- 
order approximation is used for adjusting the inference of the tail index, infer- 
ence on the tail itself is still based on the GPD; in contrast, in our approach 
the E(G)PD is fitted directly to the high-threshold excesses. Indeed, as we will 
show later, even if the (G)PD parameters are estimated in an unbiased way, 
tail probability estimators may still exhibit asymptotic bias if based upon the 
(G)PD approximation. 



The main advantages of the new model are a reduction of the bias of estimators 
of tail parameters and a good fit to excesses over a larger range of thresholds. In 
an actuarial context, the relevance of using more elaborate models has already 
been discussed for instance in Frigessi et al. (2002) and Cooray and Ananda 
(2005). 



In case of heavy-tailed distributions, it is more convenient to work with relative 
excesses $X/u$ rather than absolute excesses $X - u$. Under the domain of 
attraction condition, the limit distribution of $X/u$ given $X > u$ for $u \to \infty$ 
is the PD. The EPD and EGPD presented here are related through the 
same affine transformation that links these relative and absolute excesses. 
Building on the theory of generalized regular variation of second order in 
de Haan and Stadtmüller (1996), it is also possible to construct an extension 
of the GPD with comparable merits applicable to distributions in all 
max-domains of attraction. However, parameter estimation in this more general 
setting is numerically quite involved (Beirlant et al., 2002b): the model 
contains one additional parameter and the upper endpoint of the distribution 
depends in a complicated way on the parameters, which complicates both 
theory and computations. 



Bias-reduction methods have already been proposed in, amongst others, 
Feuerverger and Hall (1999), Gomes et al. (2000), Beirlant et al. (1999), 
Beirlant et al. (2002a), Gomes and Martins (2002), and Gomes and Martins (2004). 
These methods focus on the distribution of log-spacings of high order statistics. 
Moreover, ad hoc construction methods for asymptotically unbiased estimators of 
the extreme value index were introduced in Peng (1998), Drees (1996) and Segers 
(2005). In contrast, 
next to providing bias-reduced tail index estimators, our model can be fitted 
directly to the excesses over a high threshold. The fitted model can then be 
used to estimate any tail-related risk measure, such as tail probabilities, tail 
quantiles (or value-at-risk), etc. 



In the same spirit as in this paper, a mixture model with two Pareto 
components was proposed in Peng and Qi (2004). The advantage of our model is 
that it also incorporates the popular GPD. From our experience, this connec- 
tion can assist in judging the quality of the GPD fit; see for instance the case 
study in Example 5.3. 



The paper is structured as follows. The next section provides the definition 
of the E(G)PD, which is shown to yield a more accurate approximation to 
the distribution of absolute and relative excesses for a wide class of heavy- 
tailed distributions. Estimators of the EPD parameters are derived in Section 3 
using the linearized score equations, and their asymptotic normality is formally 
stated. In Section 4, we compare the asymptotic distribution and the finite- 
sample behavior of the estimators of the extreme value index following from 
PD, GPD and EPD modelling. To illustrate how to apply the methodology to 
the estimation of general tail-related risk measures, we elaborate in Section 5 
on tail probability estimation with theoretical results and a practical case. The 
appendices, finally, contain the statement and proof of an auxiliary result on 
a certain tail empirical process followed by the proofs of the main theorems. 

2 The Extended (Generalized) Pareto Distribution 



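Definition 2.1 The Extended Pareto Distribution (EPD) with parameter 
vector $(\gamma, \delta, \tau)$ in the range $\tau < 0 < \gamma$ and $\delta > \max(-1, 1/\tau)$ is defined 
by its distribution function 

\[
G_{\gamma,\delta,\tau}(y) =
\begin{cases}
1 - \{y(1 + \delta - \delta y^{\tau})\}^{-1/\gamma}, & \text{if } y > 1, \\
0, & \text{if } y \leq 1.
\end{cases}
\]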



The Extended Generalized Pareto Distribution (EGPD) is defined by its 
distribution function 

\[ H_{\gamma,\delta,\tau}(x) = G_{\gamma,\delta,\tau}(1 + x), \qquad x \in \mathbb{R}. \]



The ordinary Pareto Distribution (PD) with shape parameter $\alpha > 0$ is a 
member of the EPD family: take $\gamma = 1/\alpha$ and $\delta = 0$ (arbitrary $\tau$). The 
Generalized Pareto Distribution (GPD) with positive shape parameter $\gamma > 0$ 
and scale parameter $\sigma > 0$ is a member of the EGPD family: take $\tau = -1$ and 
$\delta = \gamma/\sigma - 1$. Finally, the distribution of the random variable $Y$ is EPD$(\gamma, \delta, \tau)$ 
if and only if the distribution of $Y - 1$ is EGPD$(\gamma, \delta, \tau)$. 
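
The definitions above translate directly into code. The following is a minimal sketch in Python (standard library only); the function names `epd_tail`, `epd_cdf`, `egpd_cdf` and `gpd_cdf` are illustrative choices, not part of the paper. It also checks numerically that the GPD is recovered from the EGPD with $\tau = -1$ and $\delta = \gamma/\sigma - 1$.

```python
import math


def epd_tail(y, gamma, delta, tau):
    """Tail function 1 - G(y) of the EPD with parameters (gamma, delta, tau)."""
    # Parameter range from the definition: tau < 0 < gamma, delta > max(-1, 1/tau).
    if not (tau < 0.0 < gamma) or delta <= max(-1.0, 1.0 / tau):
        raise ValueError("need tau < 0 < gamma and delta > max(-1, 1/tau)")
    if y <= 1.0:
        return 1.0  # G(y) = 0 for y <= 1
    return (y * (1.0 + delta - delta * y ** tau)) ** (-1.0 / gamma)


def epd_cdf(y, gamma, delta, tau):
    """EPD distribution function G(y)."""
    return 1.0 - epd_tail(y, gamma, delta, tau)


def egpd_cdf(x, gamma, delta, tau):
    """EGPD distribution function H(x) = G(1 + x)."""
    return epd_cdf(1.0 + x, gamma, delta, tau)


def gpd_cdf(x, gamma, sigma):
    """Ordinary GPD with positive shape gamma and scale sigma, for comparison."""
    if x <= 0.0:
        return 0.0
    return 1.0 - (1.0 + gamma * x / sigma) ** (-1.0 / gamma)


if __name__ == "__main__":
    gamma, sigma = 0.5, 2.0
    delta = gamma / sigma - 1.0  # GPD as a member of the EGPD family (tau = -1)
    for x in (0.1, 1.0, 10.0):
        print(x, egpd_cdf(x, gamma, delta, -1.0), gpd_cdf(x, gamma, sigma))
```

The two printed columns should agree up to rounding, since $(1+x)\{1 + \delta - \delta(1+x)^{-1}\} = 1 + (1+\delta)x = 1 + \gamma x/\sigma$.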

We will use the E(G)PD to model tails of heavy-tailed distributions that 
satisfy a certain second-order condition, to be described next. For a distribution 
function $F$, write $\bar F = 1 - F$. Recall that a positive, measurable function $f$ 
defined in some right neighborhood of infinity is regularly varying with index 
$\beta \in \mathbb{R}$ if $\lim_{u \to \infty} f(ux)/f(u) = x^{\beta}$ for all $x \in (0, \infty)$; notation $f \in \mathcal{R}_{\beta}$. The 
following definition describes a subset of the class of distribution functions $F$ 
for which $\bar F \in \mathcal{R}_{-1/\gamma}$, $\gamma > 0$. Note that the latter is precisely the class of 
distributions in the max-domain of attraction of the Fréchet distribution with 
shape parameter $1/\gamma$. 

Definition 2.2 Let $\gamma > 0$ and $\tau < 0$ be constants. A distribution function $F$ 
is said to belong to the class $\mathcal{F}(\gamma, \tau)$ if $x^{1/\gamma} \bar F(x) \to C \in (0, \infty)$ as $x \to \infty$ 
and if the function $\delta$ defined via 

\[ \bar F(x) = C x^{-1/\gamma} \{1 + \gamma^{-1} \delta(x)\} \tag{2.1} \]

is eventually nonzero and of constant sign and such that $|\delta| \in \mathcal{R}_{\tau}$. 



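Note that $|\delta| \in \mathcal{R}_{\tau}$ with $\tau < 0$ implies $\delta(x) \to 0$ as $x \to \infty$. In many examples, 
the function $\delta$ in Definition 2.2 is actually of the form $\delta(x) \sim D x^{\tau}$ as $x \to \infty$ 
for some nonzero constant $D$, a class of distributions which was first considered 
in Hall (1982). See Table 1 for examples; for later use, we also list $\rho = \gamma\tau$ (see 
Lemma 2.4 below). 

| Distribution [parameters] | Distribution function | $\gamma$ | $\tau$ | $\rho = \gamma\tau$ |
|---|---|---|---|---|
| Burr$(\gamma, \rho, \beta)$ [$\gamma > 0$, $\rho < 0$, $\beta > 0$] | $1 - (1 + x^{-\rho/\gamma}/\beta)^{1/\rho}$ | $\gamma$ | $\rho/\gamma$ | $\rho$ |
| Fréchet$(\alpha)$ [$\alpha > 0$] | $\exp(-x^{-\alpha})$ | $1/\alpha$ | $-\alpha$ | $-1$ |
| GPD$(\gamma, \sigma)$ [$\gamma > 0$, $\sigma > 0$] | $1 - (1 + \gamma x/\sigma)^{-1/\gamma}$ | $\gamma$ | $-1$ | $-\gamma$ |
| Student-$t_{\nu}$ [$\nu > 0$] | $\int_{-\infty}^{x} \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1 + y^2/\nu)^{-(\nu+1)/2}\,dy$ | $1/\nu$ | $-2$ | $-2/\nu$ |

Table 1 

Extreme value index $\gamma$ and second-order constants $\tau$ and $\rho = \gamma\tau$ for selected 
heavy-tailed distributions. 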

Let $X$ be a random variable with distribution function $F$ and let $u > 0$ be 
such that $F(u) < 1$. The conditional distributions of relative and absolute 
excesses of $X$ over $u$ are given by 

\[
\Pr(X/u > y \mid X > u) = \frac{\bar F(uy)}{\bar F(u)}
\quad \text{and} \quad
\Pr(X - u > x \mid X > u) = \frac{\bar F(u + x)}{\bar F(u)}
\]

for $x \geq 0$ and $y \geq 1$. The next proposition shows that for $F \in \mathcal{F}(\gamma, \tau)$, the 
EPD and the EGPD improve the PD and GPD approximations to these excess 
distributions by an order of magnitude. 



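Proposition 2.3 If $F \in \mathcal{F}(\gamma, \tau)$, then as $u \to \infty$, 

\[ \sup_{y \geq 1} \left| \frac{\bar F(uy)}{\bar F(u)} - \bar G_{\gamma,\delta(u),\tau}(y) \right| = o\{|\delta(u)|\}, \tag{2.2} \]

\[ \sup_{x \geq 0} \left| \frac{\bar F(u + x)}{\bar F(u)} - \bar H_{\gamma,\delta(u),\tau}(x/u) \right| = o\{|\delta(u)|\}. \tag{2.3} \]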



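Proof Equation (2.3) follows directly from (2.2) by writing $u + x = uy$ or 
$y = 1 + x/u$ and exploiting the link between the EPD and the EGPD. So let 
us show (2.2). On the one hand, we have 

\[
\frac{\bar F(uy)}{\bar F(u)}
= y^{-1/\gamma}\, \frac{1 + \gamma^{-1}\delta(uy)}{1 + \gamma^{-1}\delta(u)}
= y^{-1/\gamma} \left\{ 1 - \frac{\gamma^{-1}\delta(u)}{1 + \gamma^{-1}\delta(u)} \left( 1 - \frac{\delta(uy)}{\delta(u)} \right) \right\}.
\]

On the other hand, since $0 \leq 1 - y^{\tau} \leq 1$ for $y \geq 1$ and since $\delta(u) \to 0$, 

\[
[y\{1 + \delta(u) - \delta(u) y^{\tau}\}]^{-1/\gamma}
= y^{-1/\gamma} \{1 - \gamma^{-1}\delta(u)(1 - y^{\tau})\} + o\{|\delta(u)|\},
\qquad u \to \infty,
\]

uniformly in $y \geq 1$. As a consequence, 

\[
\frac{\bar F(uy)}{\bar F(u)} - [y\{1 + \delta(u) - \delta(u) y^{\tau}\}]^{-1/\gamma}
= -\gamma^{-1} y^{-1/\gamma} \delta(u) \left\{ \frac{1}{1 + \gamma^{-1}\delta(u)} \left( 1 - \frac{\delta(uy)}{\delta(u)} \right) - (1 - y^{\tau}) \right\} + o\{|\delta(u)|\},
\qquad u \to \infty,
\]

uniformly in $y \geq 1$. The asymptotic relation (2.2) now follows from the 
uniform convergence theorem for regularly varying functions with negative index 
(Bingham et al., 1987, Theorem 1.5.2). □ 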



If in (2.2) we would replace the EPD tail function $\bar G_{\gamma,\delta(u),\tau}(y)$ by the PD tail 
function $y^{-1/\gamma}$, the rate of convergence would be $O\{|\delta(u)|\}$ only. Similarly, if in 
(2.3) we would replace the EGPD tail function $\bar H_{\gamma,\delta(u),\tau}(x/u)$ by the GPD tail 
function $(1 + \gamma x/\sigma)^{-1/\gamma}$ for some $\sigma = \sigma(u)$, then, provided $\tau \neq -1$, the rate 
of convergence would again be $O\{|\delta(u)|\}$ only. If $\tau = -1$, the EGPD is just a 
reparametrization of the GPD, so that in that case, the GPD approximation 
is already of the order $o\{|\delta(u)|\}$. 
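
The difference in approximation rates can be illustrated numerically. The sketch below (Python, standard library only) uses the Burr distribution from Table 1, for which $\delta(u)$ can be computed from (2.1), and approximates the two suprema on a finite grid of $y$ values. The function names, the grid and the parameter values are illustrative assumptions; a grid maximum is only a proxy for the supremum in (2.2).

```python
def burr_tail(x, gamma, rho, beta):
    """Tail function of the Burr(gamma, rho, beta) distribution of Table 1."""
    return (1.0 + x ** (-rho / gamma) / beta) ** (1.0 / rho)


def sup_errors(u, gamma, rho, beta, y_max=50.0, steps=5000):
    """Grid approximations to the sup-errors of the PD and EPD tails at threshold u."""
    tau = rho / gamma
    c = beta ** (-1.0 / rho)  # limit C of x^{1/gamma} * tail(x)
    # delta(u) solved from (2.1): tail(u) = C u^{-1/gamma} {1 + delta(u)/gamma}
    delta_u = gamma * (burr_tail(u, gamma, rho, beta) * u ** (1.0 / gamma) / c - 1.0)
    err_pd = err_epd = 0.0
    for j in range(steps + 1):
        y = 1.0 + (y_max - 1.0) * j / steps
        ratio = burr_tail(u * y, gamma, rho, beta) / burr_tail(u, gamma, rho, beta)
        pd_tail = y ** (-1.0 / gamma)
        epd_tail = (y * (1.0 + delta_u - delta_u * y ** tau)) ** (-1.0 / gamma)
        err_pd = max(err_pd, abs(ratio - pd_tail))
        err_epd = max(err_epd, abs(ratio - epd_tail))
    return delta_u, err_pd, err_epd


if __name__ == "__main__":
    for u in (2.0, 5.0, 10.0):
        print(u, sup_errors(u, gamma=0.5, rho=-1.0, beta=1.0))
```

For these parameter values the PD error is of the same order as $|\delta(u)|$, while the EPD error is markedly smaller, in line with Proposition 2.3.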

It will be useful to rephrase our second-order assumption on $F$ in terms of the 
tail quantile function $U$ defined by 

\[ U(y) = Q(1 - 1/y) \quad \text{with} \quad Q(p) = \inf\{x \in \mathbb{R} : F(x) \geq p\}, \tag{2.4} \]

where $y \in (1, \infty)$ and $p \in (0, 1)$. Note that $U$ is a (generalized) inverse of $1/\bar F$. 

Lemma 2.4 If $F \in \mathcal{F}(\gamma, \tau)$ with $\lim_{x \to \infty} x^{1/\gamma} \bar F(x) = C \in (0, \infty)$, then 
$\lim_{y \to \infty} y^{-\gamma} U(y) = C^{\gamma}$, and the function $a$ defined implicitly by 

\[ U(y) = C^{\gamma} y^{\gamma} \{1 + a(y)\} \tag{2.5} \]

satisfies $a(y) = \delta(U(y))\{1 + o(1)\} = \delta(C^{\gamma} y^{\gamma})\{1 + o(1)\}$ as $y \to \infty$, with $\delta$ as 
in (2.1). 

In particular, $a$ is eventually nonzero and of constant sign and $|a| \in \mathcal{R}_{\rho}$ with 
$\rho = \gamma\tau < 0$. In addition, even if $F$ is not continuous, then still $y \bar F(U(y)) = 
1 + o\{|a(y)|\}$ as $y \to \infty$. 
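
As a quick sanity check of Lemma 2.4, the sketch below (Python, standard library only) evaluates $a(y)$ and $\delta(U(y))$ in closed form for the Burr distribution of Table 1, where $U$ can be inverted explicitly. The helper names and the parameter values are illustrative assumptions, not taken from the paper.

```python
def burr_tail(x, gamma, rho, beta):
    """Tail function of the Burr(gamma, rho, beta) distribution."""
    return (1.0 + x ** (-rho / gamma) / beta) ** (1.0 / rho)


def burr_U(y, gamma, rho, beta):
    """Tail quantile function U(y): solves tail(U(y)) = 1/y."""
    return (beta * (y ** (-rho) - 1.0)) ** (-gamma / rho)


def a_function(y, gamma, rho, beta):
    """a(y) defined through U(y) = C^gamma y^gamma {1 + a(y)}, see (2.5)."""
    c = beta ** (-1.0 / rho)
    return burr_U(y, gamma, rho, beta) / (c ** gamma * y ** gamma) - 1.0


def delta_function(x, gamma, rho, beta):
    """delta(x) defined through tail(x) = C x^{-1/gamma} {1 + delta(x)/gamma}, see (2.1)."""
    c = beta ** (-1.0 / rho)
    return gamma * (burr_tail(x, gamma, rho, beta) * x ** (1.0 / gamma) / c - 1.0)


if __name__ == "__main__":
    gamma, rho, beta = 0.5, -1.0, 1.0
    for y in (1e2, 1e3, 1e4):
        u = burr_U(y, gamma, rho, beta)
        print(y, a_function(y, gamma, rho, beta) / delta_function(u, gamma, rho, beta))
```

The printed ratio $a(y)/\delta(U(y))$ should approach 1 as $y$ grows.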

3 Parameter Estimation 

Our aim is to make inference on the distribution function $F$ on the region to the 
right of some high, positive threshold $u$. To this end, we assume $F \in \mathcal{F}(\gamma, \tau)$ 
and rewrite (2.2) as follows: as $u \to \infty$ and uniformly in $y \geq 1$, 

\[ \bar F(uy) = \bar F(u)\, \bar G_{\gamma,\delta(u),\tau}(y) + o\{\bar F(u)|\delta(u)|\}. \tag{3.1} \]

Omitting the remainder term leads to an approximation of $\bar F(x)$ for $x \geq u$ in 
terms of $\bar F(u)$ and the EPD parameters $(\gamma, \delta(u), \tau)$. Replacing these unknown 
quantities by estimates then yields our estimate for $\bar F(x)$. 

The purpose of this section is to construct estimators of the E(G)PD 
parameters $(\gamma, \delta(u), \tau)$. As usual in extreme value statistics, the threshold exceedance 
probability $\bar F(u)$ will be estimated nonparametrically. Although the arguments 
leading to the estimators will be of a heuristic nature only, the asymptotic 
behaviour of the estimators will be stated and proved rigorously. 

Let $X_1, \ldots, X_n$ be a random sample from $F$. In view of (2.2), the estimates 
of the EPD parameters will be based on the relative excesses $X_i/u$ over $u$, for 
those $i \in \{1, \ldots, n\}$ such that $X_i > u$. In an extreme value asymptotic setting, 
the threshold $u$ needs to tend to infinity to make the approximation valid; at 
the same time, in a statistical context, the number of excesses over $u$ must 
be sufficiently large to make inference feasible. Denoting the order statistics 
by $X_{1:n} \leq \cdots \leq X_{n:n}$, we can ensure both criteria to be met by choosing a 
data-adaptive threshold $u = u_n = X_{n-k:n}$ where $k = k_n \in \{1, \ldots, n - 1\}$ is an 
intermediate sequence of integers, that is, $k \to \infty$ and $k/n \to 0$ as $n \to \infty$. 
For convenience, assume $F(0) = 0$, so that all $X_i$ are positive with probability 
one. 

Recall the tail quantile function $U$ in (2.4) and the auxiliary function $a$ in 
Lemma 2.4. In addition to $k$ being an intermediate sequence, we will assume 
that 

\[ \sqrt{k}\, a(n/k) \to \lambda \in \mathbb{R}, \qquad n \to \infty. \tag{3.2} \]

Writing $\delta_n = \delta(u_n) = \delta(X_{n-k:n})$, we will show later that (3.2) implies 

\[ \sqrt{k}\, \delta_n = \lambda + o_p(1), \qquad n \to \infty. \tag{3.3} \]

Since in the definition of the EPD the term $x^{\tau}$ is multiplied by $\delta$, the previous 
display implies that the asymptotic distribution of tail estimators based on 
(3.1) will not depend on the asymptotic distribution of the estimator of $\tau$, 
not even on its rate of convergence. Therefore, we will assume for the moment 
that $\tau$ (or $\rho$) is known. In the end, the unknown second-order parameters 
will be replaced by consistent estimators, a substitution which will be shown 
not to affect the asymptotic distributions of the other estimators. Note that 
under the regime $\sqrt{k}\,|a(n/k)| \to \infty$ as $n \to \infty$, which will not be considered 
in this paper, the asymptotic distribution of the estimator of the second-order 
parameter does play a role. 

The estimators of $\gamma$ and $\delta_n$ will be found by maximizing an approximation to 
the EPD likelihood given the sample of $k$ relative excesses $X_{n-k+i:n}/X_{n-k:n}$, 
$i \in \{1, \ldots, k\}$, over the random threshold $X_{n-k:n}$. The density function of the 
EPD is given by 

\[ g_{\gamma,\delta,\tau}(x) = \frac{1}{\gamma}\, x^{-1/\gamma - 1} \{1 + \delta(1 - x^{\tau})\}^{-1/\gamma - 1} [1 + \delta\{1 - (1 + \tau)x^{\tau}\}]. \]

The score functions admit the following expansions in $\delta \to 0$: 

\[ \frac{\partial}{\partial\gamma} \log g_{\gamma,\delta,\tau}(x) = -\frac{1}{\gamma} + \frac{1}{\gamma^2} \log x + \frac{\delta}{\gamma^2}(1 - x^{\tau}) + O(\delta^2), \]

\[ \frac{\partial}{\partial\delta} \log g_{\gamma,\delta,\tau}(x) = \frac{1}{\gamma}\{(1 - \gamma\tau)x^{\tau} - 1\} + \{1 - 2(1 - \gamma\tau)x^{\tau} + (1 - 2\gamma\tau - \gamma\tau^2)x^{2\tau}\}\frac{\delta}{\gamma} + O(\delta^2). \]

Define 

\[ H_{k,n} = \frac{1}{k} \sum_{i=1}^{k} \log(X_{n-k+i:n}/X_{n-k:n}), \tag{3.4} \]

\[ E_{k,n}(s) = \frac{1}{k} \sum_{i=1}^{k} (X_{n-k+i:n}/X_{n-k:n})^{s}, \qquad s \leq 0. \tag{3.5} \]

Note that $H_{k,n}$ is the Hill estimator (Hill, 1975). Assume for the moment 
that $\tau$ is known. Given the sample of excesses $X_{n-k+i:n}/X_{n-k:n}$, $i = 1, \ldots, k$, 
solving the linearized score equations yields the following equations for the 
pseudo-maximum likelihood estimators for $\gamma$ and $\delta$: 

\[ \tilde\gamma_{k,n} = H_{k,n} + \tilde\delta_{k,n}\{1 - E_{k,n}(\tau)\}, \tag{3.6} \]

\[ (\tilde\gamma_{k,n}\tau - 1)E_{k,n}(\tau) + 1 = \{1 - 2(1 - \tilde\gamma_{k,n}\tau)E_{k,n}(\tau) + (1 - 2\tilde\gamma_{k,n}\tau - \tilde\gamma_{k,n}\tau^2)E_{k,n}(2\tau)\}\,\tilde\delta_{k,n}. \tag{3.7} \]

Substitute the expression for $\tilde\gamma_{k,n}$ in (3.6) into the left-hand side of (3.7) and 
solve for $\tilde\delta_{k,n}$ to get 

\[ \tilde\delta_{k,n} = \frac{(H_{k,n}\tau - 1)E_{k,n}(\tau) + 1}{D_{k,n}} = \frac{H_{k,n}\tau - 1}{D_{k,n}} \left( E_{k,n}(\tau) - \frac{1}{1 - H_{k,n}\tau} \right), \]

the denominator being 

\[ D_{k,n} = 1 - 2(1 - \tilde\gamma_{k,n}\tau)E_{k,n}(\tau) + (1 - 2\tilde\gamma_{k,n}\tau - \tilde\gamma_{k,n}\tau^2)E_{k,n}(2\tau) - \tau\{1 - E_{k,n}(\tau)\}E_{k,n}(\tau). \]

By (3.3), $\tilde\delta_{k,n}$ can be expected to be of the order $O_p(k^{-1/2})$ as $n \to \infty$. This 
justifies the following simplifications. Since the distribution of relative excesses 
over a large threshold is approximately Pareto with shape parameter $1/\gamma$, for 
$s \leq 0$, 

\[ E_{k,n}(s) = \frac{1}{1 - \gamma s} + o_p(1), \qquad n \to \infty; \]

see Theorem A.1. Hence, writing $\rho = \gamma\tau$, we have $E_{k,n}(\tau) = (1 - \rho)^{-1} + o_p(1)$ 
and $E_{k,n}(2\tau) = (1 - 2\rho)^{-1} + o_p(1)$ as $n \to \infty$, so that 

\[ D_{k,n} = -\frac{\rho^4}{\gamma(1 - 2\rho)(1 - \rho)^2} + o_p(1), \qquad n \to \infty. \]
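
The limit of $D_{k,n}$ can be verified by direct substitution. The following is a sketch of the algebra, obtained by replacing the pseudo-maximum likelihood estimator of $\gamma$ by $\gamma$ itself (which is justified heuristically because the estimator of $\delta$ is of order $O_p(k^{-1/2})$), using $\gamma\tau = \rho$ and $\gamma\tau^2 = \rho\tau$, and dropping all $o_p(1)$ terms:

```latex
\begin{align*}
D_{k,n}
&\approx 1 - \frac{2(1-\rho)}{1-\rho} + \frac{1 - 2\rho - \rho\tau}{1 - 2\rho}
   - \tau \left( 1 - \frac{1}{1-\rho} \right) \frac{1}{1-\rho} \\
&= -\frac{\rho\tau}{1-2\rho} + \frac{\rho\tau}{(1-\rho)^2} \\
&= \rho\tau\, \frac{(1-2\rho) - (1-\rho)^2}{(1-2\rho)(1-\rho)^2} \\
&= -\frac{\rho^3 \tau}{(1-2\rho)(1-\rho)^2}
 = -\frac{\rho^4}{\gamma (1-2\rho)(1-\rho)^2}.
\end{align*}
```

Combined with $H_{k,n}\tau - 1 \approx -(1 - \rho)$, this gives a prefactor $(H_{k,n}\tau - 1)/D_{k,n} \approx \gamma(1 - 2\rho)(1 - \rho)^3\rho^{-4}$ in the expression for the estimator of $\delta$.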



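This leads to the following simplified estimators: 

\[ \hat\delta_{k,n} = H_{k,n}(1 - 2\rho)(1 - \rho)^3 \rho^{-4} \left( E_{k,n}(\tau) - \frac{1}{1 - H_{k,n}\tau} \right), \]

\[ \hat\gamma_{k,n} = H_{k,n} - \hat\delta_{k,n}\, \frac{\rho}{1 - \rho}. \]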



Up to now we have assumed that $\rho$ is known. Let $\hat\rho_n$ be a weakly consistent 
estimator sequence of $\rho = \gamma\tau$; see for instance Fraga Alves et al. (2003a), 
Fraga Alves et al. (2003b), and Peng and Qi (2004). Replace $\tau$, which is 
unknown, by $\hat\tau_{k,n} = \hat\rho_n/H_{k,n}$, to finally get 

\[ \hat\delta_{k,n} = H_{k,n}(1 - 2\hat\rho_n)(1 - \hat\rho_n)^3 \hat\rho_n^{-4} \left( E_{k,n}(\hat\rho_n/H_{k,n}) - \frac{1}{1 - \hat\rho_n} \right), \tag{3.8} \]

\[ \hat\gamma_{k,n} = H_{k,n} - \hat\delta_{k,n}\, \frac{\hat\rho_n}{1 - \hat\rho_n}. \tag{3.9} \]


Further, put 

\[ Z_{k,n} = \sqrt{k}\{n \bar F(X_{n-k:n})/k - 1\}. \tag{3.10} \]

The joint asymptotics of $Z_{k,n}$ with $(\hat\gamma_{k,n}, \hat\delta_{k,n})$ will become relevant in Section 5 
when estimating tail probabilities on the basis of (3.1) with $u = X_{n-k:n}$. Let 
the arrow $\rightsquigarrow$ denote convergence in distribution. 

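Theorem 3.1 Let $F \in \mathcal{F}(\gamma, \tau)$ and let $X_1, \ldots, X_n$ be independent random 
variables with common distribution function $F$. Let $k = k_n$ be an intermediate 
sequence satisfying (3.2). Recall $\delta_n = \delta(X_{n-k:n})$ and $Z_{k,n}$ in (3.10). If $\hat\rho_n = 
\rho + o_p(1)$ as $n \to \infty$, with $\rho = \gamma\tau$, then $\sqrt{k}\,\delta_n = \lambda + o_p(1)$ as $n \to \infty$ and 

\[ \left( \sqrt{k}(\hat\gamma_{k,n} - \gamma),\ \sqrt{k}(\hat\delta_{k,n} - \delta_n),\ Z_{k,n} \right) \rightsquigarrow N_3(0, \Sigma), \qquad n \to \infty, \tag{3.11} \]

a trivariate normal distribution with mean vector zero and covariance matrix 

\[
\Sigma =
\begin{pmatrix}
\gamma^2 \dfrac{(1-\rho)^2}{\rho^2} & -\gamma^2 \dfrac{(1-2\rho)(1-\rho)}{\rho^3} & 0 \\[2ex]
-\gamma^2 \dfrac{(1-2\rho)(1-\rho)}{\rho^3} & \gamma^2 \dfrac{(1-2\rho)(1-\rho)^2}{\rho^4} & 0 \\[2ex]
0 & 0 & 1
\end{pmatrix}.
\tag{3.12}
\]

An asymptotic confidence interval for $\gamma$ of nominal level $1 - \alpha$ is given by 

\[ \left( \hat\gamma_{k,n} \left( 1 + \frac{1 - \hat\rho_n}{\hat\rho_n} \frac{z_{\alpha/2}}{\sqrt{k}} \right),\ \hat\gamma_{k,n} \left( 1 - \frac{1 - \hat\rho_n}{\hat\rho_n} \frac{z_{\alpha/2}}{\sqrt{k}} \right) \right), \tag{3.13} \]

with $z_{\alpha/2}$ the $1 - \alpha/2$ quantile of the standard normal distribution. 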
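
The interval (3.13) can be computed as follows. This is a minimal sketch in Python (standard library only); the function name and the numerical inputs in the example are illustrative assumptions, not values from the paper.

```python
import math
from statistics import NormalDist


def gamma_confidence_interval(gamma_hat, rho_hat, k, alpha=0.10):
    """Asymptotic confidence interval (3.13) for gamma at nominal level 1 - alpha."""
    if rho_hat >= 0:
        raise ValueError("rho_hat must be negative")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # (1 - rho_hat) / rho_hat is negative, so "1 + half" gives the lower end point.
    half = (1.0 - rho_hat) / rho_hat * z / math.sqrt(k)
    return gamma_hat * (1.0 + half), gamma_hat * (1.0 - half)


if __name__ == "__main__":
    # Hypothetical inputs: gamma_hat = 0.3, rho_hat = -1, k = 100, 90% level.
    print(gamma_confidence_interval(0.3, -1.0, 100, alpha=0.10))
```

Because $\hat\rho_n < 0$, the first end point in (3.13) is the lower one; the interval is symmetric around the point estimate with half-width $\hat\gamma_{k,n}\,(1 - \hat\rho_n)\,z_{\alpha/2}/(|\hat\rho_n|\sqrt{k})$.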



The proof of Theorem 3.1 is given in Appendix B. It is based on a functional 
central limit theorem for a certain tail empirical process, stated and proved 
in Appendix A. Note that the asymptotic distribution of $\hat\rho_n$ is unimportant; 
the only requirement is that the estimator is consistent for p. 

The fact that the limit distribution in (3.11) is centered for any $\lambda$ is important 
for two reasons: 

1 It makes possible the use of larger $k$ and thus of lower thresholds compared 
to when the mean would be proportional to $\lambda$. In this way, the model can 
be fitted to a larger fraction of the data, leading to a reduction of the 
asymptotic variances and thus of the asymptotic mean squared errors of 
the parameter estimates. 

2 Sample paths of the estimates as a function of k will exhibit larger regions of 
stability around the true value. As a consequence, the choice of k becomes 
easier. 

These issues will be illustrated in the simulations in Section 4 and in the case 
study in Example 5.3. 



4 Comparison of Extreme Value Index Estimators 



Under the conditions of Theorem 3.1, we have 

\[ \sqrt{k}(\hat\gamma_{k,n} - \gamma) \rightsquigarrow N\!\left( 0,\ \gamma^2 \frac{(1-\rho)^2}{\rho^2} \right), \qquad n \to \infty. \tag{4.1} \]

According to Drees (1998), the asymptotic variance is minimal for 
scale-invariant, asymptotically unbiased estimators of $\gamma$ of a certain form. The limit 
distribution in (4.1) corresponds with the one of the estimators in Beirlant et al. 
(1999), Feuerverger and Hall (1999) and Gomes and Martins (2002). 



The maximum likelihood estimator $\hat\gamma^{\mathrm{GPD}}_{k,n}$ for $\gamma$ arises from fitting the GPD to the 
excesses $X_{n-k+i:n} - X_{n-k:n}$, $i = 1, \ldots, k$. Its asymptotics have been studied 
in Smith (1987), Drees et al. (2004) and de Haan and Ferreira (2006, 
Theorem 3.4.2). From the latter theorem, it follows that under the conditions of 
our Theorem 3.1, we have 

\[ \sqrt{k}(\hat\gamma^{\mathrm{GPD}}_{k,n} - \gamma) \rightsquigarrow N\!\left( \lambda\, b(\gamma, \rho),\ (1 + \gamma)^2 \right), \qquad n \to \infty, \tag{4.2} \]

where 

\[ b(\gamma, \rho) = \frac{\rho(1 + \gamma)(\gamma + \rho)}{\gamma(1 - \rho)(1 + \gamma - \rho)}. \]

Comparing (4.1) and (4.2), we see that if $\tau = -1$ and thus $\rho = -\gamma$, the 
asymptotic distributions of $\hat\gamma_{k,n}$ and $\hat\gamma^{\mathrm{GPD}}_{k,n}$ coincide. This is in accordance 
with the fact that the EGPD with $\tau = -1$ is a reparametrization of the GPD 
and the fact that the EPD estimators were obtained by solving the linearized 
score equations. 



Finally, under the conditions of Theorem 3.1, the asymptotic distribution of 
the Hill estimator is 

\[ \sqrt{k}(H_{k,n} - \gamma) \rightsquigarrow N\!\left( \lambda \frac{\rho}{1 - \rho},\ \gamma^2 \right), \qquad n \to \infty; \tag{4.3} \]

see for instance Theorem A.1 below. Of the three estimators considered, the 
Hill estimator has the smallest asymptotic variance. Unless $\lambda = 0$, however, its 
asymptotic bias is never zero. The asymptotic distribution of the Hill estimator 
and its optimal variance property are of course well known; see for instance 
Reiss (1989, Section 9.4), Drees (1998) and Beirlant et al. (2006). 



To illustrate the behavior of the three estimators, we generated samples from 
four different distributions. For each distribution, we generated 10,000 samples 
of size $n = 1{,}000$ and computed the three extreme value index estimators for 
$k$ up to 500. For the EPD estimator, we estimated the second-order parameter 
$\rho$ using the estimator in Fraga Alves et al. (2003b). For each distribution and 
each estimator, we computed Monte Carlo estimates of the bias, variance and 
mean squared error by averaging out over the 10,000 samples. 



Comparing the asymptotic results to the graphs in Figures 1-2 we learn the 
following: 

Fréchet distribution with $\alpha = 1$. We have $\gamma = 1/\alpha = 1$, $\tau = -\alpha = -1$, and 
$\rho = \gamma\tau = -1$. From (4.1) and (4.2), it follows that the asymptotic 
distributions of the EPD and the GPD estimators coincide, with zero asymptotic 
bias and an asymptotic variance of $4/k$. The Hill estimator has an 
asymptotic variance of $1/k$ only, but its asymptotic bias is nonzero. 

Student $t$ distribution with $\nu = 4$. We have $\gamma = 1/\nu = 1/4$, $\tau = -2$, and 
$\rho = \gamma\tau = -1/2$. The asymptotic variances of the three estimators are $\sigma^2/k$ 
with $\sigma^2 = \gamma^2 = 1/16$ for the Hill estimator, $\sigma^2 = \gamma^2(1-\rho)^2/\rho^2 = 9/16$ for the 
EPD estimator, and $\sigma^2 = (1 + \gamma)^2 = 25/16$ for the GPD estimator. Of the 
three estimators, the EPD estimator is the only one which is asymptotically 
unbiased. 

Pareto mixture distribution defined by $\bar F(x) = (1 + c)^{-1} x^{-\alpha}(1 + c x^{-\alpha})$, $x \geq 1$, 
with shape parameter $\alpha = 2$ and mixing parameter $c = 2$. We have $\gamma = 
1/\alpha = 1/2$, $\tau = -\alpha = -2$, and $\rho = \gamma\tau = -1$. The weight of the second-order 
component is equal to $c = 2$ times the weight of the first-order component, 
inducing a severe bias to the Hill and GPD estimators; the EPD estimator is 
much less affected by this. The asymptotic variances of the three estimators 
are $\sigma^2/k$ with $\sigma^2 = \gamma^2 = 1/4$ for the Hill estimator, $\sigma^2 = \gamma^2(1-\rho)^2/\rho^2 = 1$ 
for the EPD estimator, and $\sigma^2 = (1 + \gamma)^2 = 9/4$ for the GPD estimator. 

Loggamma distribution with shape parameter $\alpha = 4$ and scale parameter $\beta = 
2$. Although this distribution has positive extreme-value index $\gamma = 1/\beta$, it is 
not in any of the classes $\mathcal{F}(\gamma, \tau)$, since $\bar F(x) \sim \text{constant} \times x^{-\beta}(\log x)^{\alpha-1}$. 
Nevertheless, the EPD estimator performs reasonably well when compared 
to the Hill and GPD estimators. 
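
The asymptotic variances quoted above follow from (4.1), (4.2) and (4.3). The short sketch below (Python, standard library only; the function names are illustrative) evaluates, for each estimator, the factor multiplying $\lambda$ in the asymptotic mean and the asymptotic variance $\sigma^2$, and reproduces the numbers for the Student $t$ example.

```python
def hill_asymptotics(gamma, rho):
    """Bias factor (multiplying lambda) and variance in (4.3)."""
    return rho / (1.0 - rho), gamma ** 2


def epd_asymptotics(gamma, rho):
    """Bias factor and variance in (4.1); the EPD estimator is asymptotically unbiased."""
    return 0.0, gamma ** 2 * (1.0 - rho) ** 2 / rho ** 2


def gpd_asymptotics(gamma, rho):
    """Bias factor b(gamma, rho) and variance in (4.2)."""
    b = rho * (1.0 + gamma) * (gamma + rho) / (gamma * (1.0 - rho) * (1.0 + gamma - rho))
    return b, (1.0 + gamma) ** 2


if __name__ == "__main__":
    gamma, rho = 0.25, -0.5  # Student t distribution with nu = 4
    for name, f in (("Hill", hill_asymptotics), ("EPD", epd_asymptotics), ("GPD", gpd_asymptotics)):
        print(name, f(gamma, rho))
```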



5 Tail Probability Estimation 

Let us return to the tail estimation problem raised in the beginning of 
Section 3. Given the order statistics $X_{1:n} \leq \cdots \leq X_{n:n}$ of an independent sample 
from an unknown distribution function $F \in \mathcal{F}(\gamma, \tau)$, we want to estimate the 
tail probability $p_n = \bar F(x_n)$, where $x_n \to \infty$ and thus $p_n \to 0$ as $n \to \infty$. As 
before, let $k = k_n \in \{1, \ldots, n - 1\}$ be an intermediate integer sequence, that 
is, $k \to \infty$ and $k/n \to 0$. Assume that $p_n = \bar F(x_n)$ satisfies 

\[ n p_n/k \to q \in [0, 1), \qquad n \to \infty. \tag{5.1} \]

Let $\hat\gamma_n$, $\hat\delta_n$, and $\hat\tau_n$ denote general estimator sequences and put $\delta_n = \delta(X_{n-k:n})$ 
as well as 

\[ \Gamma_{k,n} = \sqrt{k}(\hat\gamma_n - \gamma) \quad \text{and} \quad \Delta_{k,n} = \sqrt{k}(\hat\delta_n - \delta_n). \tag{5.2} \]

Recall $Z_{k,n}$ in (3.10) and assume that 

\[ \hat\tau_n = \tau + o_p(1) \quad \text{and} \quad (\Gamma_{k,n}, \Delta_{k,n}, Z_{k,n}) \rightsquigarrow (\Gamma, \Delta, Z), \qquad n \to \infty, \tag{5.3} \]

a trivariate random vector. A possible choice for the estimators of $\gamma$ and $\delta_n$ 
are the ones studied in Theorem 3.1. However, we will formulate our results so 
as to allow for general estimator sequences satisfying (5.3). For the estimator 
of $\tau$, one can for instance take $\hat\tau_n = \hat\rho_n/\hat\gamma_n$ where $\hat\rho_n$ is an estimator of $\rho = \gamma\tau$, 
see for instance Fraga Alves et al. (2003b). As in Theorem 3.1, the asymptotic 
distribution of $\hat\tau_n$ plays no role. 

Fig. 1. Variance (top), bias (middle), and mean squared error (bottom) of the Hill 
(dashed), GPD (dotdashed), and EPD (solid) estimator in case of the unit Fréchet 
distribution (left) and the Student $t$ distribution with $\nu = 4$ degrees of freedom 
(right). The sample size was $n = 1{,}000$ and the plots were obtained by averaging 
out over 10,000 samples. 

Fig. 2. Variance (top), bias (middle), and mean squared error (bottom) of the Hill 
(dashed), GPD (dotdashed), and EPD (solid) estimator in case of a Pareto mixture 
distribution ($\alpha = 2$, $c = 2$; left) and a Loggamma distribution ($\alpha = 4$, $\beta = 2$; right). 
The sample size was $n = 1{,}000$ and the plots were obtained by averaging out over 
10,000 samples. 

Omitting the remainder term in (3.1) and replacing the unknown 
quantities $\bar F(u)$ and $(\gamma, \delta(u), \tau)$ at the random threshold $u = X_{n-k:n}$ by $k/n$ and 
$(\hat\gamma_n, \hat\delta_n, \hat\tau_n)$, respectively, yields the estimator 

\[ \hat p_{k,n} = \hat{\bar F}_n(x_n) = \frac{k}{n}\, \bar G_{\hat\gamma_n,\hat\delta_n,\hat\tau_n}(x_n/X_{n-k:n}). \]

In the same way, one can construct estimators for other tail quantities: return 
levels, expected shortfall, etc. For brevity, we focus here on tail probabilities. 
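
In code, the tail probability estimator is a one-line plug-in of the fitted EPD tail. The sketch below (Python, standard library only) is illustrative: the function names and the numerical inputs in the example are assumptions made for demonstration and are not taken from the paper or from any data set. For comparison it also includes the Pareto-based version obtained by setting the estimate of $\delta_n$ to zero.

```python
def epd_tail_probability(x, threshold, k, n, gamma_hat, delta_hat, tau_hat):
    """Plug-in estimate (k/n) * EPD tail evaluated at x / X_{n-k:n}."""
    y = x / threshold
    if y <= 1.0:
        raise ValueError("x must exceed the threshold")
    return (k / n) * (y * (1.0 + delta_hat - delta_hat * y ** tau_hat)) ** (-1.0 / gamma_hat)


def pareto_tail_probability(x, threshold, k, n, gamma_hat):
    """Same estimator with delta_hat = 0, i.e. based on the Pareto approximation."""
    return (k / n) * (x / threshold) ** (-1.0 / gamma_hat)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    args = dict(x=10.0, threshold=2.0, k=100, n=1000)
    print(epd_tail_probability(gamma_hat=0.4, delta_hat=0.2, tau_hat=-1.5, **args))
    print(pareto_tail_probability(gamma_hat=0.4, **args))
```

With a positive estimate of $\delta_n$ the EPD-based estimate lies below the Pareto-based one, and with a negative estimate it lies above; the two coincide when the estimate is zero.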

In order to describe the asymptotics of $\hat p_{k,n}$, we need to make a distinction 
between the case $0 < q < 1$ in (5.1) and $q = 0$. The proofs of the following 
two theorems are to be found in Appendix C. Results for tail probability 
estimators based on the PD and GPD approximations can be found in 
de Haan and Ferreira (2006, Section 4.4). 



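Theorem 5.1 Let $F \in \mathcal{F}(\gamma, \tau)$, let $k_n$ be an intermediate sequence satisfying 
(3.2) and let $p_n$ be such that (5.1) holds for some $0 < q < 1$. If (5.3) holds, then 

\[ \sqrt{k}\left( \frac{\hat p_{k,n}}{p_n} - 1 \right) \rightsquigarrow -\gamma^{-1}\Gamma \log q - \gamma^{-1}\Delta(1 - q^{-\rho}) - Z, \qquad n \to \infty. \tag{5.4} \]

Theorem 5.2 In Theorem 5.1, if (5.1) is replaced by 

\[ n p_n/k \to 0 \quad \text{and} \quad \log(n p_n)/\sqrt{k} \to 0, \qquad n \to \infty, \]

then 

\[ \frac{\sqrt{k}}{\log\{k/(n p_n)\}} \left( \frac{\hat p_{k,n}}{p_n} - 1 \right) \rightsquigarrow \gamma^{-1}\Gamma, \qquad n \to \infty. \]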



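For the EPD estimators $\hat\gamma_n = \hat\gamma_{k,n}$ and $\hat\delta_n = \hat\delta_{k,n}$, Theorems 3.1 and 5.1 lead 
to 

\[ \sqrt{k}\left( \frac{\hat p_n}{p_n} - 1 \right) \rightsquigarrow N(0, \sigma^2(q, \rho)), \qquad n \to \infty, \tag{5.5} \]

with asymptotic variance given by 

\[
\sigma^2(q, \rho) = (\log q)^2 \frac{(1-\rho)^2}{\rho^2}
+ \left( \frac{1 - q^{-\rho}}{\rho} \right)^2 \frac{(1-2\rho)(1-\rho)^2}{\rho^2}
- 2 \log q\, \frac{1 - q^{-\rho}}{\rho}\, \frac{(1-2\rho)(1-\rho)}{\rho^2} + 1.
\]

The importance of the fact that the limit distribution in (5.5) has mean zero 
was already discussed after Theorem 3.1. An asymptotic confidence interval 
of nominal level $1 - \alpha$ is given by 

\[ \left( \hat p_n \left( 1 - \sigma(\hat q_n, \hat\rho_n) \frac{z_{\alpha/2}}{\sqrt{k}} \right),\ \hat p_n \left( 1 + \sigma(\hat q_n, \hat\rho_n) \frac{z_{\alpha/2}}{\sqrt{k}} \right) \right), \tag{5.6} \]

where $\hat q_n = n \hat p_n/k_n$ and with $z_{\alpha/2}$ the $1 - \alpha/2$ quantile of the standard normal 
distribution. 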
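
The variance function $\sigma^2(q, \rho)$ and the interval (5.6) can be evaluated as in the following sketch (Python, standard library only). The function names and the inputs in the example are illustrative assumptions. The variance is written as the quadratic form implied by (5.4) when $(\Gamma, \Delta)$ has the covariance structure of Theorem 3.1 and $Z$ is an independent standard normal variable.

```python
import math
from statistics import NormalDist


def sigma2(q, rho):
    """Asymptotic variance sigma^2(q, rho) in (5.5), for 0 < q <= 1 and rho < 0."""
    log_q = math.log(q)
    w = (1.0 - q ** (-rho)) / rho
    return (log_q ** 2 * (1.0 - rho) ** 2 / rho ** 2
            + w ** 2 * (1.0 - 2.0 * rho) * (1.0 - rho) ** 2 / rho ** 2
            - 2.0 * log_q * w * (1.0 - 2.0 * rho) * (1.0 - rho) / rho ** 2
            + 1.0)


def tail_probability_interval(p_hat, rho_hat, k, n, alpha=0.10):
    """Asymptotic confidence interval (5.6) around a tail probability estimate p_hat."""
    q_hat = n * p_hat / k
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = math.sqrt(sigma2(q_hat, rho_hat)) * z / math.sqrt(k)
    return p_hat * (1.0 - half), p_hat * (1.0 + half)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(sigma2(0.3, -0.5))
    print(tail_probability_interval(p_hat=0.01, rho_hat=-0.5, k=100, n=1000))
```

Note that $\sigma^2(q, \rho) \to 1$ as $q \to 1$: when $x_n$ is close to the threshold, only the uncertainty in the empirical exceedance probability $k/n$ remains.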



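If we simply define $\hat\delta_n = 0$, then $\Delta_{k,n} = -\sqrt{k}\,\delta_n$ in (5.2) and thus $\Delta = 
-\lambda$ in (5.3). The tail probability estimator $\hat p_n$ then reduces to the Weissman 
estimator (Weissman, 1978) 

\[ \hat p^{W}_n = \frac{k}{n} \left( \frac{x_n}{X_{n-k:n}} \right)^{-1/\hat\gamma_n}. \tag{5.7} \]

Theorem 5.1 then implies 

\[ \sqrt{k}\left( \frac{\hat p^{W}_n}{p_n} - 1 \right) \rightsquigarrow -\gamma^{-1}\Gamma \log q + \gamma^{-1}\lambda(1 - q^{-\rho}) - Z, \qquad n \to \infty. \tag{5.8} \]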



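For instance, if we estimate $\gamma$ by the Hill estimator, then in view of 
Theorem A.1, 

\[
\sqrt{k}\left\{ \frac{k}{n p_n} \left( \frac{x_n}{X_{n-k:n}} \right)^{-1/H_{k,n}} - 1 \right\}
\rightsquigarrow N\!\left( -\frac{\lambda}{\gamma} \left( \frac{\rho \log q}{1 - \rho} + q^{-\rho} - 1 \right),\ 1 + (\log q)^2 \right),
\qquad n \to \infty.
\]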



Even if the extreme value index estimator $\hat\gamma_n$ is such that the asymptotic 
distribution of $\Gamma_{k,n}$ has mean zero, then still the asymptotic distribution (5.8) 
of the Weissman estimator will have a mean which is proportional to $\lambda$. In 
other words, unbiased tail estimation requires more than unbiased estimation 
of the extreme value index alone. 

From Theorem 5.2 and its proof, we learn that for estimation of tail 
probabilities $p_n$ of smaller order than $k/n$, the difference between the Pareto 
approximation and the EPD approximation does not matter asymptotically. Still, for 
$\hat p_n$ to be an asymptotically unbiased estimator of $p_n$, the estimator $\hat\gamma_n$ needs to 
be asymptotically unbiased for $\gamma$. For instance, if we use the EPD estimator 
$\hat\gamma_{k,n}$, then 

\[ \frac{\sqrt{k}}{\log\{k/(n p_n)\}} \left( \frac{\hat p_n}{p_n} - 1 \right) \rightsquigarrow N\!\left( 0,\ \frac{(1-\rho)^2}{\rho^2} \right), \qquad n \to \infty. \]

Fig. 3. Trajectories of estimates of $\gamma$ (left) and of the exceedance probability over 
€7 million (right) for the Secura Belgian Re data in Example 5.3. 

Example 5.3 The Secura Belgian Re data in Beirlant et al. (2004, Section 1.3.3) 
comprise 371 automobile claims not smaller than €1.2 million. The data span 
the period 1988-2001 and have been gathered from several European insurance 
companies. Figure 3 shows the estimates of $\gamma$ (left) and of the probability of a 
claim to exceed €7 million (right). Nominal 90% confidence intervals for the 
EPD estimates are added too, see (3.13) and (5.6). In the data set, there were 
actually 3 exceedances over €7 million, yielding a nonparametric estimate 
of 3/371 = 0.81%. In comparison to the Weissman (Hill) and POT (GPD) 
estimates, the trajectories of the EPD estimates are relatively stable, with 
$\hat\gamma$ around 0.3 and $\hat p$ around 0.75%. By way of comparison, in Beirlant et al. 
(2004, Section 6.2.4) it is suggested to model the complete distribution by a 
mixture of two components, an exponential and a Pareto distribution, with 
the knot at about €2.6 million, which corresponds to the order statistic $X_{n-k:n}$ 
with $k = 95$. Although this knot is detected by the EPD estimator, it does 
not cause the tail parameter estimates to change dramatically. 



A Tail Empirical Processes 



Recall H kn , E k ^ n (s) and Z kjTl from equations (3.4), (3.5) and (3.10), respec- 
tively, and define 

r k>n = Vk(H k , n - 7), (A.l) 
E fc ,„(s) = Vk (E k>n (s) - ' s < °- ( A ' 2 ) 

Our proof of Theorem 3.1 will be based on the fact that (I\ n , Efc jTl , Z k>n ) 
converges weakly in the space R x "Jf [s , 0] x R; here sq < and ^[a,b] is 
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the Banach space of continuous functions / : [a, b] — > R equipped with the 
topology of uniform convergence. Of course, the asymptotic distribution of the 
normalized Hill estimator V k>n has been established in numerous other papers; 
in the following theorem, it is the joint convergence which is our main concern. 

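Theorem A.1 Let $F \in \mathcal{F}(\gamma, \tau)$. If $k = k_n$ is an
intermediate integer sequence satisfying (3.2), then for every $s_0 < 0$, in
$\mathbb{R} \times \mathcal{C}[s_0, 0] \times \mathbb{R}$,

\[ (\Gamma_{k,n}, \mathbb{E}_{k,n}, Z_{k,n}) \rightsquigarrow (\Gamma, \mathbb{E}, Z), \qquad n \to \infty, \]

a Gaussian process with the following distribution: $Z$ is standard normal and
is independent of $(\Gamma, \mathbb{E})$, and for $s, s_1, s_2 \in [s_0, 0]$,

\[ \mathrm{E}[\mathbb{E}(s)] = \lambda \frac{s\rho}{(1 - s\gamma - \rho)(1 - s\gamma)}, \qquad \mathrm{E}[\Gamma] = \lambda \frac{\rho}{1 - \rho}, \]

\[ \operatorname{cov}\{\mathbb{E}(s_1), \mathbb{E}(s_2)\} = \frac{s_1 s_2 \gamma^2}{(1 - s_1\gamma - s_2\gamma)(1 - s_1\gamma)(1 - s_2\gamma)}, \qquad \operatorname{var}(\Gamma) = \gamma^2, \]

\[ \operatorname{cov}\{\Gamma, \mathbb{E}(s)\} = \frac{s\gamma^2}{(1 - s\gamma)^2}. \]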



Proof Let $Y_1, \tilde{Y}_1, Y_2, \tilde{Y}_2, \ldots$ be independent Pareto(1)
random variables. For positive integer $k$, denote the order statistics of
$Y_1, \ldots, Y_k$ by $Y_{1:k} < \cdots < Y_{k:k}$; also, let $Y_{0:k} = 1$.
Similarly, denote the order statistics of $\tilde{Y}_1, \ldots, \tilde{Y}_n$ by
$\tilde{Y}_{1:n} < \cdots < \tilde{Y}_{n:n}$. Then the following three vectors
are equal in distribution:

\[ (X_{n-k+i:n} : i = 0, \ldots, k) \stackrel{d}{=} \bigl( U(\tilde{Y}_{n-k+i:n}) : i = 0, \ldots, k \bigr) \stackrel{d}{=} \bigl( U(Y_{i:k} \tilde{Y}_{n-k:n}) : i = 0, \ldots, k \bigr). \]

Since we are only interested in the asymptotic distribution of
$(\Gamma_{k,n}, \mathbb{E}_{k,n}, Z_{k,n})$, we may without loss of generality
assume that actually

\[ (X_{n-k+i:n} : i = 0, \ldots, k) = \bigl( U(Y_{i:k} \tilde{Y}_{n-k:n}) : i = 0, \ldots, k \bigr). \]

The following property is well-known: if $k$ is an intermediate sequence, then

\[ \sqrt{k}\,\{(n/k)\,\tilde{Y}_{n-k:n}^{-1} - 1\} \rightsquigarrow N(0, 1), \qquad n \to \infty. \tag{A.3} \]

[A quick proof is to employ the distributional representation
$\tilde{Y}_{n-k:n} \stackrel{d}{=} (E_1 + \cdots + E_{n+1}) / (E_1 + \cdots + E_{k+1})$,
with $E_1, \ldots, E_{n+1}$ independent standard exponential random variables.]
As a consequence, we have $\tilde{Y}_{n-k:n} = (n/k)\{1 + o_p(1)\}$ as
$n \to \infty$, and therefore, by (3.2) and the Uniform Convergence Theorem for
$\mathcal{R}_\rho$ (Bingham et al., 1987, Theorem 1.5.2),

\[ \sqrt{k}\,a(\tilde{Y}_{n-k:n}) = \sqrt{k}\,a(n/k)\,\frac{a(\tilde{Y}_{n-k:n})}{a(n/k)} = \lambda + o_p(1), \qquad n \to \infty. \tag{A.4} \]

Since $a(y) \sim \delta(U(y))$ as $y \to \infty$, this also shows that
$\sqrt{k}\,\delta(X_{n-k:n}) = \lambda + o_p(1)$ as $n \to \infty$.
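As a rough numerical illustration of (A.3), the following sketch simulates the normalized statistic through the exponential representation of uniform order statistics. The function name `simulate_z`, the sample sizes and the seed are illustrative choices, not taken from the paper; sums of independent standard exponentials are drawn directly as gamma variables.

```python
import math
import random
import statistics


def simulate_z(n, k, reps, seed=0):
    """Draw `reps` copies of sqrt(k) * ((n/k) / Y_{n-k:n} - 1) for Pareto(1) data.

    Uses 1 / Y_{n-k:n} = U_{k+1:n} = S_{k+1} / S_{n+1} in distribution, where
    S_m is a sum of m independent standard exponentials, i.e. Gamma(m, 1).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        head = rng.gammavariate(k + 1, 1.0)  # E_1 + ... + E_{k+1}
        tail = rng.gammavariate(n - k, 1.0)  # E_{k+2} + ... + E_{n+1}
        u = head / (head + tail)             # distributed as U_{k+1:n}
        draws.append(math.sqrt(k) * ((n / k) * u - 1.0))
    return draws


# Illustrative values with k much smaller than n (an "intermediate" regime).
z = simulate_z(n=20000, k=200, reps=2000)
print(round(statistics.fmean(z), 3), round(statistics.pstdev(z), 3))
```

When $k$ is large and $k/n$ is small, the printed sample mean and standard deviation should be close to 0 and 1, in line with the standard normal limit.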

In the next three paragraphs, we will analyse the components $\Gamma_{k,n}$,
$\mathbb{E}_{k,n}$ and $Z_{k,n}$ separately. In the fourth and final paragraph,
these analyses will be combined.

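1. The component $\Gamma_{k,n}$. Let the function $a$ be as in (2.5) and define
$\eta(y) = \log\{1 + a(y)\}$. Since $\lim_{y \to \infty} a(y) = 0$, we have
$\eta(y) = a(y)\{1 + o(1)\}$ as $y \to \infty$, and hence

\[ \sqrt{k}\,\eta(\tilde{Y}_{n-k:n}) = \lambda + o_p(1), \qquad n \to \infty. \tag{A.5} \]

In particular, $\eta$ is eventually nonzero and of constant sign, and
$|\eta| \in \mathcal{R}_\rho$. We have

\begin{align*}
H_{k,n} &= \frac{1}{k} \sum_{i=1}^{k} \log X_{n-k+i:n} - \log X_{n-k:n} \\
&= \frac{1}{k} \sum_{i=1}^{k} \log U(Y_{i:k} \tilde{Y}_{n-k:n}) - \log U(\tilde{Y}_{n-k:n}) \\
&= \frac{1}{k} \sum_{i=1}^{k} \bigl\{ \gamma \log Y_i + \eta(Y_i \tilde{Y}_{n-k:n}) - \eta(\tilde{Y}_{n-k:n}) \bigr\}.
\end{align*}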

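As a consequence,

\begin{align*}
\Gamma_{k,n} &= \sqrt{k}\,(H_{k,n} - \gamma) \\
&= \frac{\gamma}{\sqrt{k}} \sum_{i=1}^{k} (\log Y_i - 1) + \sqrt{k}\,\eta(\tilde{Y}_{n-k:n})\,\frac{1}{k} \sum_{i=1}^{k} \Bigl( \frac{\eta(Y_i \tilde{Y}_{n-k:n})}{\eta(\tilde{Y}_{n-k:n})} - 1 \Bigr).
\end{align*}

By the Uniform Convergence Theorem for $\mathcal{R}_\rho$, for every $x_0 > 0$,

\[ \lim_{y \to \infty} \sup_{x \ge x_0} \Bigl| \frac{\eta(xy)}{\eta(y)} - x^{\rho} \Bigr| = 0. \]

By the last two displays and in view of (A.5),

\[ \max_{i=1,\ldots,k} \Bigl| \frac{\eta(Y_i \tilde{Y}_{n-k:n})}{\eta(\tilde{Y}_{n-k:n})} - Y_i^{\rho} \Bigr| = o_p(1), \qquad n \to \infty. \tag{A.6} \]

By (A.5) and since $k^{-1} \sum_{i=1}^{k} Y_i^{\rho} = (1 - \rho)^{-1} + o_p(1)$ as
$k \to \infty$, we find

\[ \Gamma_{k,n} = \frac{\gamma}{\sqrt{k}} \sum_{i=1}^{k} (\log Y_i - 1) + \lambda \Bigl( \frac{1}{1 - \rho} - 1 \Bigr) + o_p(1), \qquad n \to \infty. \tag{A.7} \]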
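For readers who want to experiment with the statistic analysed in this step, a minimal sketch of the Hill estimator $H_{k,n}$, as written in the first line of the display for $H_{k,n}$ above, is given below. The function name `hill_estimator` and the toy data are illustrative and do not come from the paper.

```python
import math


def hill_estimator(data, k):
    """Hill estimator based on the k largest observations.

    H_{k,n} = (1/k) * sum_{i=1}^{k} log X_{n-k+i:n} - log X_{n-k:n}.
    """
    x = sorted(data)
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = x[n - k - 1]  # X_{n-k:n} (0-based index n-k-1)
    if threshold <= 0:
        raise ValueError("the threshold X_{n-k:n} must be positive")
    top = x[n - k:]           # X_{n-k+1:n}, ..., X_{n:n}
    return sum(math.log(v) for v in top) / k - math.log(threshold)


# Toy data: threshold 2 and top values 4, 8, so H = (log 2 + log 4) / 2.
print(hill_estimator([1.0, 2.0, 4.0, 8.0], k=2))
```

Because only log-ratios to the threshold enter, the estimate is unchanged when all observations are multiplied by the same positive constant.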



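2. The component $\mathbb{E}_{k,n}$. Recall the notation
$\eta(y) = \log\{1 + a(y)\}$, so that
$U(y) = C^{\gamma} y^{\gamma} \exp\{\eta(y)\}$. We have

\begin{align*}
E_{k,n}(s) &= \frac{1}{k} \sum_{i=1}^{k} \Bigl( \frac{X_{n-k+i:n}}{X_{n-k:n}} \Bigr)^{s} \\
&= \frac{1}{k} \sum_{i=1}^{k} Y_i^{\gamma s} \exp\bigl[ s \{ \eta(Y_i \tilde{Y}_{n-k:n}) - \eta(\tilde{Y}_{n-k:n}) \} \bigr].
\end{align*}

Writing $\varepsilon_{i,n} = \eta(Y_i \tilde{Y}_{n-k:n}) / \eta(\tilde{Y}_{n-k:n}) - Y_i^{\rho}$, we find

\[ E_{k,n}(s) = \frac{1}{k} \sum_{i=1}^{k} Y_i^{\gamma s} \exp\{ s\,\eta(\tilde{Y}_{n-k:n}) (Y_i^{\rho} - 1 + \varepsilon_{i,n}) \}. \]

Recall the elementary inequality $|e^z - 1 - z| \le (z^2/2) \max(e^z, 1)$,
$z \in \mathbb{R}$. Since $0 < Y_i^{\gamma s} \le 1$, $0 < Y_i^{\rho} \le 1$ and
$\max_{i=1,\ldots,k} |\varepsilon_{i,n}| = o_p(1)$ [see (A.6)], we get by (A.5),

\[ \sup_{s \in [s_0, 0]} \Bigl| E_{k,n}(s) - \frac{1}{k} \sum_{i=1}^{k} Y_i^{\gamma s} \{ 1 + s\,\eta(\tilde{Y}_{n-k:n}) (Y_i^{\rho} - 1) \} \Bigr| = o_p\{ |\eta(\tilde{Y}_{n-k:n})| \} = o_p(k^{-1/2}), \qquad n \to \infty. \]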



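For $\theta_0 < 0$, the class of functions
$\{ f_\theta : \theta \in [\theta_0, 0] \}$ from $[1, \infty)$ to $(0, 1]$
defined by $f_\theta(y) = y^{\theta}$, $y \ge 1$, satisfies the
Glivenko-Cantelli property

\[ \sup_{\theta \in [\theta_0, 0]} \Bigl| \frac{1}{k} \sum_{i=1}^{k} f_\theta(Y_i) - \frac{1}{1 - \theta} \Bigr| = o_p(1), \qquad k \to \infty; \]

see for instance Example 19.8 in van der Vaart (1998) or just use the
monotonicity and continuity of $y^{\theta}$ in $\theta$. In view of (A.5), we
obtain

\[ \sup_{s \in [s_0, 0]} \Bigl| E_{k,n}(s) - \frac{1}{k} \sum_{i=1}^{k} Y_i^{\gamma s} - s\,\eta(\tilde{Y}_{n-k:n}) \Bigl( \frac{1}{1 - \gamma s - \rho} - \frac{1}{1 - \gamma s} \Bigr) \Bigr| = o_p\{ |\eta(\tilde{Y}_{n-k:n})| \} = o_p(k^{-1/2}), \qquad n \to \infty. \]

Using (A.5) again, we find

\[ \mathbb{E}_{k,n}(s) = \sqrt{k} \Bigl( \frac{1}{k} \sum_{i=1}^{k} Y_i^{\gamma s} - \frac{1}{1 - \gamma s} \Bigr) + s\lambda \Bigl( \frac{1}{1 - \gamma s - \rho} - \frac{1}{1 - \gamma s} \Bigr) + \varepsilon_n(s) \tag{A.8} \]

with

\[ \sup_{s \in [s_0, 0]} |\varepsilon_n(s)| = o_p(1), \qquad n \to \infty. \]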
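The quantities appearing in this step can be computed as follows. The sketch gives the statistic $E_{k,n}(s)$, its Pareto centering $1/(1 - \gamma s)$ from (A.2), and the deterministic drift term of (A.8). All function names are hypothetical, and the parameter values in the example are arbitrary.

```python
def e_statistic(data, k, s):
    """E_{k,n}(s) = (1/k) * sum_{i=1}^{k} (X_{n-k+i:n} / X_{n-k:n})**s, s <= 0."""
    if s > 0:
        raise ValueError("s must be non-positive")
    x = sorted(data)
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = x[n - k - 1]  # X_{n-k:n}
    if threshold <= 0:
        raise ValueError("the threshold X_{n-k:n} must be positive")
    return sum((v / threshold) ** s for v in x[n - k:]) / k


def pareto_centering(gamma, s):
    """1 / (1 - gamma*s), the mean of Y**(gamma*s) for Y Pareto(1)."""
    return 1.0 / (1.0 - gamma * s)


def second_order_drift(s, lam, gamma, rho):
    """Drift s*lambda*(1/(1 - gamma*s - rho) - 1/(1 - gamma*s)) from (A.8)."""
    return s * lam * (1.0 / (1.0 - gamma * s - rho) - 1.0 / (1.0 - gamma * s))


# Toy data: relative excesses 2 and 4, so E_{k,n}(-1) = (1/2 + 1/4) / 2.
print(e_statistic([1.0, 2.0, 4.0, 8.0], k=2, s=-1.0))
print(second_order_drift(s=-1.0, lam=2.0, gamma=0.5, rho=-1.0))
```

The drift can equivalently be written as $\lambda s\rho / \{(1 - \gamma s - \rho)(1 - \gamma s)\}$, which is the form of the mean function in Theorem A.1.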
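3. The component $Z_{k,n}$. By (2.1) and (2.5), we find

\[ y \bar{F}(U(y)) = 1 + o\{ |a(y)| \}, \qquad y \to \infty. \]

As a consequence,

\begin{align*}
\bar{F}(X_{n-k:n}) &= \bar{F}(U(\tilde{Y}_{n-k:n})) \\
&= \tilde{Y}_{n-k:n}^{-1} \bigl[ 1 + o_p\{ |a(\tilde{Y}_{n-k:n})| \} \bigr] \\
&= \tilde{Y}_{n-k:n}^{-1} \{ 1 + o_p(k^{-1/2}) \},
\end{align*}

where we used (A.4) in the last step. We obtain

\begin{align*}
Z_{k,n} &= \sqrt{k}\,\{ (n/k) \bar{F}(X_{n-k:n}) - 1 \} \\
&= \sqrt{k}\,\{ (n/k) \tilde{Y}_{n-k:n}^{-1} - 1 \} + o_p(1), \qquad n \to \infty. \tag{A.9}
\end{align*}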

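4. Joint convergence. Define

\[ \tilde{\Gamma}_k = \frac{\gamma}{\sqrt{k}} \sum_{i=1}^{k} (\log Y_i - 1), \qquad \tilde{\mathbb{E}}_k(s) = \sqrt{k} \Bigl( \frac{1}{k} \sum_{i=1}^{k} Y_i^{\gamma s} - \frac{1}{1 - \gamma s} \Bigr), \quad s \le 0. \]

For $\theta_0 < 0$, the class of functions
$\{ f_\theta : \theta \in [\theta_0, 0] \}$ defined by $f_\theta(y) = y^{\theta}$,
$y \ge 1$, is Donsker with respect to the Pareto(1) distribution; this follows
from Example 19.7 in van der Vaart (1998) upon noting that
$|y^{\theta_1} - y^{\theta_2}| \le |\theta_1 - \theta_2| \log y$ for
$\theta_1 \le 0$, $\theta_2 \le 0$ and $y \ge 1$. As a consequence, in
$\mathbb{R} \times \mathcal{C}[s_0, 0]$,

\[ (\tilde{\Gamma}_k, \tilde{\mathbb{E}}_k) \rightsquigarrow (\tilde{\Gamma}, \tilde{\mathbb{E}}), \qquad k \to \infty, \tag{A.10} \]

a centered Gaussian process with covariance function

\[ \operatorname{var} \tilde{\Gamma} = \operatorname{var}(\gamma \log Y_1) = \gamma^2, \]

\[ \operatorname{cov}\{ \tilde{\mathbb{E}}(s_1), \tilde{\mathbb{E}}(s_2) \} = \operatorname{cov}(Y_1^{\gamma s_1}, Y_1^{\gamma s_2}) = \frac{1}{1 - s_1\gamma - s_2\gamma} - \frac{1}{(1 - s_1\gamma)(1 - s_2\gamma)}, \]

\[ \operatorname{cov}\{ \tilde{\Gamma}, \tilde{\mathbb{E}}(s) \} = \operatorname{cov}(\gamma \log Y_1, Y_1^{\gamma s}) = \frac{s\gamma^2}{(1 - s\gamma)^2}. \]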
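The covariance formulas for the limit in (A.10) can be checked numerically against the Pareto(1) moment formula $\mathrm{E}[Y^{\theta}] = 1/(1 - \theta)$, $\theta < 1$, and the identity $\mathrm{E}[Y^{\theta} \log Y] = \frac{d}{d\theta} \mathrm{E}[Y^{\theta}]$. The sketch below is only a sanity check under these standard facts; the function names, the finite-difference step and the parameter values are illustrative.

```python
def pareto_moment(theta):
    """E[Y**theta] = 1 / (1 - theta) for Y Pareto(1) and theta < 1."""
    if theta >= 1:
        raise ValueError("the moment is finite only for theta < 1")
    return 1.0 / (1.0 - theta)


def cov_e_from_moments(s1, s2, gamma):
    """cov(Y**(gamma*s1), Y**(gamma*s2)) as a difference of moments."""
    joint = pareto_moment(gamma * (s1 + s2))
    return joint - pareto_moment(gamma * s1) * pareto_moment(gamma * s2)


def cov_e_closed_form(s1, s2, gamma):
    """Closed form of the same covariance, as stated in Theorem A.1."""
    a, b = s1 * gamma, s2 * gamma
    return a * b / ((1.0 - a - b) * (1.0 - a) * (1.0 - b))


def cov_gamma_e_closed_form(s, gamma):
    """cov(gamma*log Y, Y**(gamma*s)) = s*gamma**2 / (1 - s*gamma)**2."""
    return s * gamma ** 2 / (1.0 - s * gamma) ** 2


def cov_gamma_e_numeric(s, gamma, h=1e-5):
    """Same covariance, with E[Y**theta * log Y] from a central difference."""
    theta = gamma * s
    d_moment = (pareto_moment(theta + h) - pareto_moment(theta - h)) / (2.0 * h)
    # E[log Y] = 1 for Pareto(1), so cov = gamma * (E[Y^theta log Y] - E[Y^theta]).
    return gamma * (d_moment - pareto_moment(theta))


print(cov_e_from_moments(-1.0, -2.0, 0.5), cov_e_closed_form(-1.0, -2.0, 0.5))
print(cov_gamma_e_numeric(-1.0, 0.5), cov_gamma_e_closed_form(-1.0, 0.5))
```

Each printed pair should agree up to floating-point and finite-difference error.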

By (A.7) and (A.8), it follows that in $\mathbb{R} \times \mathcal{C}[s_0, 0]$,

\[ (\Gamma_{k,n}, \mathbb{E}_{k,n}) \rightsquigarrow (\Gamma, \mathbb{E}), \qquad n \to \infty. \]

Finally, from (A.3) and (A.9) it follows that
$(\Gamma_{k,n}, \mathbb{E}_{k,n}, Z_{k,n}) \rightsquigarrow (\Gamma, \mathbb{E}, Z)$
as $n \to \infty$, where $Z$ is standard normally distributed and is
independent of $(\Gamma, \mathbb{E})$. □

B Proof of Theorem 3.1



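The fact that $\sqrt{k}\,\delta(X_{n-k:n}) = \lambda + o_p(1)$ as $n \to \infty$
has already been shown in the proof of Theorem A.1; in particular, see (A.4).
Recall $\Gamma_{k,n}$ and $\mathbb{E}_{k,n}(s)$ in equations (A.1) and (A.2),
respectively, and write $\hat{\tau}_{k,n} = \hat{\rho}_n / H_{k,n}$. We have

\begin{align*}
\sqrt{k} \Bigl( E_{k,n}(\hat{\tau}_{k,n}) - \frac{1}{1 - \hat{\rho}_n} \Bigr)
&= \sqrt{k} \Bigl( E_{k,n}(\hat{\tau}_{k,n}) - \frac{1}{1 - \gamma \hat{\tau}_{k,n}} \Bigr) + \sqrt{k} \Bigl( \frac{1}{1 - \gamma \hat{\tau}_{k,n}} - \frac{1}{1 - \hat{\rho}_n} \Bigr) \\
&= \mathbb{E}_{k,n}(\hat{\tau}_{k,n}) - \frac{\hat{\rho}_n}{H_{k,n} (1 - \gamma \hat{\tau}_{k,n})(1 - \hat{\rho}_n)}\,\Gamma_{k,n}.
\end{align*}

By Theorem A.1, $H_{k,n} = \gamma + k^{-1/2} \Gamma_{k,n} = \gamma + o_p(1)$ and
thus $\hat{\tau}_{k,n} = \tau + o_p(1)$ as $n \to \infty$. It follows that

\[ \sqrt{k} \Bigl( E_{k,n}(\hat{\tau}_{k,n}) - \frac{1}{1 - \hat{\rho}_n} \Bigr) = \mathbb{E}_{k,n}(\hat{\tau}_{k,n}) - \frac{\rho}{\gamma (1 - \rho)^2}\,\Gamma_{k,n} + o_p(1), \qquad n \to \infty. \]

Substituting this into the definition of $\hat{\delta}_{k,n}$ yields

\[ \sqrt{k}\,\hat{\delta}_{k,n} = \gamma (1 - 2\rho)(1 - \rho)^3 \rho^{-4} \Bigl( \mathbb{E}_{k,n}(\hat{\tau}_{k,n}) - \frac{\rho}{\gamma (1 - \rho)^2}\,\Gamma_{k,n} \Bigr) + o_p(1), \qquad n \to \infty, \tag{B.1} \]

as well as

\[ \sqrt{k}\,(\hat{\gamma}_{k,n} - \gamma) = \frac{(1 - \rho)^2}{\rho^2} \Bigl( \Gamma_{k,n} - \gamma \frac{1 - 2\rho}{\rho}\,\mathbb{E}_{k,n}(\hat{\tau}_{k,n}) \Bigr) + o_p(1), \qquad n \to \infty. \tag{B.2} \]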

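From $\hat{\tau}_{k,n} = \tau + o_p(1)$ and Theorem A.1, it follows that

\[ (\Gamma_{k,n}, \mathbb{E}_{k,n}, Z_{k,n}, \hat{\tau}_{k,n}) \rightsquigarrow (\Gamma, \mathbb{E}, Z, \tau), \qquad n \to \infty. \]

For $s_0 < \tau$, we have $\Pr(s_0 \le \hat{\tau}_{k,n} \le 0) \to 1$ as
$n \to \infty$, and thus, by the previous display and the continuous mapping
theorem,

\[ (\Gamma_{k,n}, \mathbb{E}_{k,n}(\hat{\tau}_{k,n}), Z_{k,n}) \rightsquigarrow (\Gamma, \mathbb{E}(\tau), Z), \qquad n \to \infty. \]

In view of (B.1) and (B.2), as $n \to \infty$,

\[ \bigl( \sqrt{k}\,(\hat{\gamma}_{k,n} - \gamma),\ \sqrt{k}\,\hat{\delta}_{k,n},\ Z_{k,n} \bigr) \rightsquigarrow \Bigl( \frac{(1 - \rho)^2}{\rho^2} \Bigl( \Gamma - \gamma \frac{1 - 2\rho}{\rho}\,\mathbb{E}(\tau) \Bigr),\ \frac{(1 - 2\rho)(1 - \rho)}{\rho^3} \Bigl( -\Gamma + \gamma \frac{(1 - \rho)^2}{\rho}\,\mathbb{E}(\tau) \Bigr),\ Z \Bigr). \tag{B.3} \]

The vector $(\Gamma, \mathbb{E}(\tau), Z)$ is trivariate normal, with $Z$
standard normal and independent of $(\Gamma, \mathbb{E}(\tau))$, with $\Gamma$
as in Theorem A.1, and with

\[ \mathrm{E}[\mathbb{E}(\tau)] = \lambda \frac{\rho^2}{\gamma (1 - 2\rho)(1 - \rho)}, \qquad \operatorname{var}\{ \mathbb{E}(\tau) \} = \frac{\rho^2}{(1 - 2\rho)(1 - \rho)^2}, \]

\[ \operatorname{cov}\{ \Gamma, \mathbb{E}(\tau) \} = \gamma \frac{\rho}{(1 - \rho)^2}. \]

As a consequence, the distribution of the limit vector in (B.3) is trivariate
normal with mean vector $(0, \lambda, 0)'$ and covariance matrix $\Sigma$ as in
(3.12).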



C Proofs for Section 5 

Proof of Theorem 5.1 Put $y_n = x_n / X_{n-k:n}$, recall
$\delta_n = \delta(X_{n-k:n})$, and define

\[ \tilde{p}_n = \bar{F}(X_{n-k:n})\,\bar{G}_{\gamma, \delta_n, \tau}(y_n). \]

Since $k \to \infty$ and $p_n \to 0$ as $n \to \infty$, it is sufficient to
prove (5.4) with $\hat{p}_n / p_n - 1$ replaced by
$\log \hat{p}_n - \log p_n$. Let us write

\[ \sqrt{k}\,(\log \hat{p}_n - \log p_n) = \sqrt{k}\,(\log \hat{p}_n - \log \tilde{p}_n) + \sqrt{k}\,(\log \tilde{p}_n - \log p_n) \]

and treat the two terms on the right-hand side separately.

1. The term $\sqrt{k}\,(\log \tilde{p}_n - \log p_n)$. We have

\[ \log \tilde{p}_n - \log p_n = \log \bar{G}_{\gamma, \delta_n, \tau}(y_n) - \log \frac{\bar{F}(y_n X_{n-k:n})}{\bar{F}(X_{n-k:n})}. \]

Since $(n/k_n) \bar{F}(y_n X_{n-k:n}) \to q$ and
$(n/k_n) \bar{F}(X_{n-k:n}) = 1 + o_p(1)$ as $n \to \infty$,

\[ \frac{\bar{F}(y_n X_{n-k:n})}{\bar{F}(X_{n-k:n})} = \frac{\bar{F}(x_n)}{\bar{F}(X_{n-k:n})} = q + o_p(1), \qquad n \to \infty. \]

Since moreover $\bar{F}$ is monotone and regularly varying of index
$-1/\gamma$, this forces $y_n \to y$ as $n \to \infty$ with
$y^{-1/\gamma} = q$, or $y = q^{-\gamma} \in (1, \infty)$. By Proposition 2.3,
we find

\[ \frac{\log \tilde{p}_n - \log p_n}{\delta_n} = o_p(1), \qquad n \to \infty. \]

Finally, from $\sqrt{k}\,\delta_n = \lambda + o_p(1)$ as $n \to \infty$, we can
conclude that

\[ \sqrt{k}\,(\log \tilde{p}_n - \log p_n) = \sqrt{k}\,\delta_n\,\frac{\log \tilde{p}_n - \log p_n}{\delta_n} = o_p(1), \qquad n \to \infty. \tag{C.1} \]

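2. The term $\sqrt{k}\,(\log \hat{p}_n - \log \tilde{p}_n)$. We have

\[ \log \hat{p}_n - \log \tilde{p}_n = \{ \log(k/n) - \log \bar{F}(X_{n-k:n}) \} + \{ \log \bar{G}_{\hat{\gamma}_n, \hat{\delta}_n, \hat{\tau}_n}(y_n) - \log \bar{G}_{\gamma, \delta_n, \tau}(y_n) \}. \tag{C.2} \]

The first term on the right-hand side is

\begin{align*}
\log(k/n) - \log \bar{F}(X_{n-k:n}) &= -\log\{ n \bar{F}(X_{n-k:n}) / k \} \\
&= -\log(1 + k^{-1/2} Z_n) \\
&= -k^{-1/2} Z_n + o_p(k^{-1/2}), \qquad n \to \infty.
\end{align*}

For the second term on the right-hand side in (C.2), we proceed as follows.
Since $y_n = y + o_p(1)$ as $n \to \infty$ and $y > 1$, it is sufficient to work
on the event $y_n > 1$. Then

\begin{align*}
&\log \bar{G}_{\hat{\gamma}_n, \hat{\delta}_n, \hat{\tau}_n}(y_n) - \log \bar{G}_{\gamma, \delta_n, \tau}(y_n) \\
&\quad = \log\bigl[ \{ y_n (1 + \hat{\delta}_n - \hat{\delta}_n y_n^{\hat{\tau}_n}) \}^{-1/\hat{\gamma}_n} \bigr] - \log\bigl[ \{ y_n (1 + \delta_n - \delta_n y_n^{\tau}) \}^{-1/\gamma} \bigr] \\
&\quad = \Bigl( \bigl( -\tfrac{1}{\hat{\gamma}_n} \bigr) - \bigl( -\tfrac{1}{\gamma} \bigr) \Bigr) \log y_n \\
&\qquad + \bigl( -\tfrac{1}{\hat{\gamma}_n} \bigr) \{ \log(1 + \hat{\delta}_n - \hat{\delta}_n y_n^{\hat{\tau}_n}) - \log(1 + \delta_n - \delta_n y_n^{\tau}) \} \\
&\qquad + \Bigl( \bigl( -\tfrac{1}{\hat{\gamma}_n} \bigr) - \bigl( -\tfrac{1}{\gamma} \bigr) \Bigr) \log(1 + \delta_n - \delta_n y_n^{\tau}). \tag{C.3}
\end{align*}

We treat the three terms on the right-hand side of (C.3) in turn. First,

\begin{align*}
\bigl( -\tfrac{1}{\hat{\gamma}_n} \bigr) - \bigl( -\tfrac{1}{\gamma} \bigr) &= \frac{\hat{\gamma}_n - \gamma}{\hat{\gamma}_n \gamma} \\
&= k^{-1/2} \gamma^{-2} \Gamma_n + O_p(k^{-1}), \qquad n \to \infty.
\end{align*}

Second, $\delta_n = O_p(k^{-1/2})$ and therefore also
$\hat{\delta}_n = O_p(k^{-1/2})$ as $n \to \infty$. Hence the second term on the
right-hand side of (C.3) is

\begin{align*}
&-\frac{1}{\hat{\gamma}_n} \{ \log(1 + \hat{\delta}_n - \hat{\delta}_n y_n^{\hat{\tau}_n}) - \log(1 + \delta_n - \delta_n y_n^{\tau}) \} \\
&\quad = \{ -\gamma^{-1} + O_p(k^{-1/2}) \} \{ \hat{\delta}_n - \hat{\delta}_n y_n^{\hat{\tau}_n} - \delta_n + \delta_n y_n^{\tau} + O_p(k^{-1}) \} \\
&\quad = -k^{-1/2} \gamma^{-1} \Delta_n (1 - y^{\tau}) + o_p(k^{-1/2}), \qquad n \to \infty.
\end{align*}

The third term on the right-hand side of (C.3) is
$O_p(k^{-1/2})\,O_p(k^{-1/2}) = O_p(k^{-1})$. All in all, we find

\[ \sqrt{k}\,(\log \hat{p}_n - \log \tilde{p}_n) = -Z_n + \gamma^{-2} \Gamma_n \log y - \gamma^{-1} \Delta_n (1 - y^{\tau}) + o_p(1) \tag{C.4} \]

as $n \to \infty$. Combine (C.1) and (C.4) and recall $y = q^{-\gamma}$ and
$\rho = \gamma\tau$ to find the result. □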



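Proof of Theorem 5.2 Recall the Weissman estimator in (5.7) and put
$d_n = k/(n p_n)$. From Theorem 4.4.7 in de Haan and Ferreira (2006), it follows
that

\[ \frac{\sqrt{k}}{\log d_n} \Bigl( \frac{\hat{p}_n^{W}}{p_n} - 1 \Bigr) \rightsquigarrow \gamma^{-1} \Gamma, \qquad n \to \infty. \]

Moreover, writing $y_n = x_n / X_{n-k:n}$,

\[ \frac{\hat{p}_n}{\hat{p}_n^{W}} = \{ 1 + \hat{\delta}_n - \hat{\delta}_n y_n^{\hat{\tau}_n} \}^{-1/\hat{\gamma}_n} = 1 + O_p(k^{-1/2}), \qquad n \to \infty. \]

As $\log d_n \to \infty$, we find that $\hat{p}_n$ and $\hat{p}_n^{W}$ have the
same asymptotic distribution. □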
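The relation between the two tail probability estimators used in this proof can be made concrete with a short sketch. It assumes the EPD survival function $\{y(1 + \delta - \delta y^{\tau})\}^{-1/\gamma}$ that appears in (C.3), and it assumes that the Weissman estimator has its usual form $(k/n)\,y^{-1/\gamma}$ evaluated with the same shape estimate. Function names and numerical values are illustrative only and are not estimates from the paper.

```python
def epd_survival(y, gamma, delta, tau):
    """EPD survival function {y * (1 + delta - delta * y**tau)}**(-1/gamma), y >= 1."""
    if y < 1:
        raise ValueError("y must be at least 1")
    base = y * (1.0 + delta - delta * y ** tau)
    if base <= 0:
        raise ValueError("parameters give a non-positive base")
    return base ** (-1.0 / gamma)


def epd_tail_probability(x, threshold, k, n, gamma, delta, tau):
    """(k/n) times the EPD survival function at y = x / threshold."""
    return (k / n) * epd_survival(x / threshold, gamma, delta, tau)


def weissman_tail_probability(x, threshold, k, n, gamma):
    """(k/n) * (x / threshold)**(-1/gamma), i.e. the EPD estimate with delta = 0."""
    return (k / n) * (x / threshold) ** (-1.0 / gamma)


# Hypothetical inputs: threshold X_{n-k:n} = 2, level x = 10, k = 50, n = 1000.
p_epd = epd_tail_probability(10.0, 2.0, 50, 1000, gamma=0.5, delta=0.2, tau=-1.0)
p_w = weissman_tail_probability(10.0, 2.0, 50, 1000, gamma=0.5)
print(p_epd, p_w, p_epd / p_w)
```

Under these assumptions the ratio of the two estimates equals $\{1 + \delta - \delta y^{\tau}\}^{-1/\gamma}$, the factor displayed in the proof, and it tends to 1 as $\delta$ tends to 0.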